Huazheng Wang

Eugene

Divide, Optimize, Merge: Fine-Grained LLM Agent Optimization at Scale

May 06, 2025
Via arXiv

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Apr 30, 2025
Via arXiv

Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models

Feb 27, 2025
Via arXiv

Memory-Augmented Agent Training for Business Document Understanding

Dec 17, 2024
Via arXiv

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning

Oct 31, 2024
Via arXiv

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Oct 18, 2024
Via arXiv

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Oct 17, 2024
Via arXiv

Conversational Dueling Bandits in Generalized Linear Models

Jul 26, 2024
Via arXiv

Contractual Reinforcement Learning: Pulling Arms with Invisible Hands

Jul 02, 2024
Via arXiv

LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking

May 31, 2024
Via arXiv